
Life Science Alliance

Life Science Alliance, LLC

Preprints posted in the last 7 days, ranked by how well they match Life Science Alliance's content profile, based on 263 papers previously published here. The average preprint has a 0.07% match score for this journal, so anything above that is already an above-average fit.

1
Transcriptomic Architecture of Type 2 Diabetes in Human Pancreatic Islets: An Integrative Meta-Analysis and Machine Learning Framework for Biomarker Discovery

Romero, R.

2026-06-10 endocrinology 10.64898/2026.06.08.26355184 medRxiv
Top 0.5%
1.5%

Background. Type 2 diabetes mellitus (T2D) is defined by progressive pancreatic β-cell dysfunction whose molecular underpinnings remain incompletely understood. Single-cohort transcriptomic analyses of donor islets have yielded heterogeneous gene lists of limited cross-study reproducibility, constraining both mechanistic interpretation and biomarker development. Methods. We combined two complementary analytical strategies applied to four public human islet transcriptomic cohorts (GSE25724, GSE20966, GSE38642, and GSE164416; n = 7-57 donors per contrast). For the integrative arm, three microarray datasets and one bulk RNA-seq dataset were processed independently and unified through gene-level random-effects meta-analysis, hallmark pathway scoring (GSVA/MSigDB), and iterative module refinement, yielding a two-axis disease framework. For the diagnostic arm, a consensus multi-method machine learning pipeline, combining LASSO penalized logistic regression, Support Vector Machine Recursive Feature Elimination (SVM-RFE), and Random Forest importance scoring, was applied to 184 differentially expressed genes from the RNA-seq cohort, with all normalization steps performed within leave-one-out cross-validation (LOOCV) folds to prevent data leakage. Machine learning classification of the RNA-seq cohort was additionally subjected to external transportability testing in the independent bulk human islet RNA-seq cohort GSE50244 using an overlap-restricted reduced score and a threshold fixed in the discovery cohort. Results. Meta-analysis across all four cohorts identified 337 high-confidence T2D-associated genes (96.1% directional concordance in beta-cell-enriched tissue). 
These were distilled into two refined 14-gene modules: ImmuneStress (MICB, HLA-DRA, HLA-DPA1, IL1R2, and others) and BetaCellIdentitySecretion (RASGRP1, PPP1R1A, SLC2A2, and others), whose composite IsletDysfunctionScore provided the most stable cross-platform separation of non-diabetic from T2D islets (Hedges' g = 1.80, $p = 9.83 \times 10^{-17}$, $I^2 = 0\%$). Consistent with progressive disease, IsletDysfunctionScore increased monotonically from non-diabetic to impaired glucose tolerance to T2D. Separately, the machine learning pipeline derived a 10-gene diagnostic panel (GABRA2, SLC2A2, ARG2, DKK3, PRIMA1, TAFA4, HHATL, PARVG, RNU1-70P, and the novel lncRNA ENSG00000284653) that achieved perfect discrimination in LOOCV (AUC = 1.000, sensitivity = 1.000, specificity = 1.000, zero misclassifications across all 57 donors). A leakage-verification experiment confirmed that this performance reflected genuine biological signal: global quantile normalization prior to cross-validation collapsed AUC to 0.380. External testing showed that 8 of the 10 panel genes were measurable in GSE50244. The frozen 8-gene reduced score retained strong discrimination (external AUC = 0.907), with 6 of 8 genes preserving directional concordance, but the discovery-derived threshold did not transfer because the external score distribution was shifted upward and compressed, yielding complete sensitivity but zero specificity at the frozen cutoff. Conclusions. Integrating pathway-level meta-analysis with machine learning classification, we present a coherent two-axis model: immune/stress activation and loss of beta-cell identity/secretory competence, together with a compact, biologically interpretable 10-gene diagnostic signature. Panel genes converge on GABA signaling, glucose transport, arginine metabolism, WNT pathway inhibition, and a novel lncRNA, providing both mechanistic hypotheses and high-priority targets for external validation. 
These findings offer a reproducible transcriptomic scaffold for future mechanistic, biomarker, and clinical translation studies of human islet dysfunction. They also support external transportability of the core biological signal, while indicating that absolute operating thresholds are cohort-dependent and would require recalibration before deployment in independent datasets.
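The abstract above stresses that normalization was performed inside each LOOCV fold. A minimal sketch of that idea follows, assuming a simple z-score and a toy nearest-centroid classifier as stand-ins (the abstract's actual pipeline uses LASSO, SVM-RFE, and Random Forest, none of which is reproduced here). All function names and the toy data are hypothetical.

```python
from statistics import mean, stdev

def zscore_params(rows):
    """Per-feature mean and standard deviation, estimated from `rows` only."""
    cols = list(zip(*rows))
    # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
    return [mean(c) for c in cols], [stdev(c) or 1.0 for c in cols]

def apply_zscore(row, mus, sds):
    """Scale one sample with parameters that were fitted elsewhere."""
    return [(x - m) / s for x, m, s in zip(row, mus, sds)]

def nearest_centroid(train_X, train_y, x):
    """Toy classifier: label of the closest class centroid (an assumption)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [r for r, y in zip(train_X, train_y) if y == label]
        centroid = [mean(c) for c in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loocv_accuracy(X, y):
    """Leave-one-out accuracy with scaling fitted on the training fold only."""
    hits = 0
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        mus, sds = zscore_params(train_X)  # the held-out sample is never seen here
        train_Z = [apply_zscore(r, mus, sds) for r in train_X]
        test_z = apply_zscore(X[i], mus, sds)
        hits += nearest_centroid(train_Z, train_y, test_z) == y[i]
    return hits / len(X)

# Hypothetical toy data: two features, two well-separated classes.
X = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5], [3.0, 20.0], [3.2, 21.0], [2.9, 19.0]]
y = [0, 0, 0, 1, 1, 1]
accuracy = loocv_accuracy(X, y)
```

The point of the design is that `zscore_params` only ever receives the training fold. Fitting it once on all samples before the loop would let the held-out sample influence its own preprocessing, which is the kind of leakage the abstract's verification experiment probes.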

2
Investigating the Y chromosome in complex disease: Phenome-wide scan across 104,334 Finnish men

Preussner, A.; Leinonen, J. T.; FinnGen, ; Pirinen, M.; Tukiainen, T.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.09.26355235 medRxiv
Top 2%
0.7%

Although the Y chromosome represents roughly 2% of the male genome, it is often ignored in genome-wide association studies (GWAS). Consequently, the potential health impacts of Y-chromosomal genetic variation remain incompletely understood. To fill this gap, we performed a phenome-wide association study (PheWAS) in FinnGen across 1,426 binary and quantitative traits using Y-chromosomal variation (frequency ≥ 1%) in 104,334 genotyped men. As Y chromosome variation is prone to population stratification, we performed carefully adjusted association analyses and further examined these through kin-based validation in 19,275 female and 24,712 male 1st degree relatives. We found 121 suggestive ($p < 5.6 \times 10^{-3}$) phenotypic associations in the Y chromosome, yet none of these were strong enough to reach phenome-wide significance ($p < 3.9 \times 10^{-6}$). While only 38 associations were supported in the kin-based validation, intriguingly we found support for a previously suggested link between haplogroup I1 and coronary heart disease (CHD; OR=1.06, 95%CI=1.02-1.11, $p = 3.7 \times 10^{-3}$; male validation OR=1.05; female validation OR=0.97). The I1-CHD association was detected across distinct geographical areas within Finland and was independent from Loss of Y (LOY) and the autosomal risk to CHD, suggesting a link between germline Y-chromosomal variation and heart disease risk. Overall, this study presents a comprehensive phenome-wide analysis of Y-chromosomal associations, highlighting the potential relevance of Y-chromosomal variation beyond sex determination. Our findings further emphasize the need for improved capture of Y-chromosomal variants and further analyses in biobank-scale data to allow for deeper exploration of male-specific genetic architecture of complex diseases.

3
Beyond External Load: Integrative Immune Monitoring Reveals Injury-Predictive Signals in the Athlete's Internal State

Heyn, H.; Perron, U.; Rodas, G.; Mendizabal Sasieta, A.; Grzelak, M.; Soto, M.; Capelli, M.; Martin-Garcia, A.; Mallol, M.; Pruna, R.; Gomez-Chereguini, L.

2026-06-11 sports medicine 10.64898/2026.06.06.26354898 medRxiv
Top 2%
0.7%

Injury risk prediction in elite football relies almost exclusively on external load metrics derived from GPS tracking, overlooking the molecular state of the athlete. We monitored 26 male players from FC Barcelona's first team across the 2025 calendar year, integrating GPS-derived training load with longitudinal blood-based immune monitoring (systemic inflammation and TCR-derived immune age). Immune age acceleration and inflammation were elevated in the 14 days preceding musculoskeletal injuries. A logistic regression model combining external load, inflammation, immune age acceleration, and career injury history reached an overall AUC of 0.678 and a mean per-player AUC of 0.754 (SD 0.146), improving on a GPS-only baseline of 0.541. Applied to 2026 data, the frozen model ranked players who later sustained non-contact musculoskeletal injuries high in the risk distribution. Together, our data suggest that multimodal immune monitoring in elite football reveals the athlete's internal physiological state, which carries injury-relevant information that external load alone does not capture.

4
Host Genetic Regulation of NLRP3 Inflammasome Cytokines Reveals Immune and Vascular Pathways in HIV

Chung, R.; Chalasani, N. S.; Barbehenn, A. S.; Lundgren, E.; Savur, S.; Shome, S.; Sheikhzadeh, C. H.; Sarvadhavabhatla, S.; Donaire, M. S.; Pae, V.; Chu, X.; Winder, D.; Maguire, C. T.; Topal, S.; Ganesan, A.; Yabes, J. M.; Larson, D. T.; Lalani, T.; Ewers, E. C.; Colombo, R. E.; Dugan, E.; Rathore, U.; Marson, A.; Agan, B. K.; Tomalka, J. A.; Sekaly, R.-P.; Loannidis, N. M.; Lee, S. A.

2026-06-10 hiv aids 10.64898/2026.06.08.26355202 medRxiv
Top 4%
0.4%

People with HIV exhibit elevated inflammation and cardiovascular risk despite antiretroviral therapy. To define the genetic architecture of inflammasome-associated inflammation, we performed whole-genome sequencing and quantified plasma IL-6, IL-1β, and IL-18 in 1,000 ART-suppressed PWH from the U.S. Military HIV Natural History Study. Genome-wide analyses identified 14 loci implicating antiviral defense (DDX17, DDX41, EEA1, BCL11A), lipid metabolism (ABCA1, ABCA12, ABCC1, AGMO), and vascular remodeling (KLHL29, RNF213, ETV1). Transcriptome-wide analyses across cardiovascular and immune tissues identified regulatory programs linking interferon signaling, immune activation, and vascular biology to circulating cytokine levels. Mendelian randomization analyses supported causal relationships between inflammasome-associated cytokines and vascular events. Functional integration with genome-wide CRISPR perturbation datasets in primary CD4 T cells linked cytokine-associated loci to HIV antiviral pathways and cytokine regulatory networks. External validation in cohorts without HIV demonstrated pathway-level convergence despite limited variant-level overlap. These findings define genetic mechanisms linking inflammasome signaling, antiviral defense, and cardiovascular risk.

5
Spatially resolved T cell receptor tracking reveals γδT cell localization to tumor-rich regions in high-risk neuroblastoma: A Report from the Children's Oncology Group

Jiang, Y.; Yu, W.; Wang, Y.; Thadi, A.; Pedersen, S.; Eagles, J.; Naranjo, A.; Collins, N.; DuBois, S. G.; Bagatell, R.; Crompton, B. D.; Tan, K.; Pugh, T. J.

2026-06-12 pediatrics 10.64898/2026.06.10.26354144 medRxiv
Top 7%
0.3%

High-risk neuroblastoma (HRNB) is a leading cause of pediatric cancer death. Current therapies center on intensive multimodal treatment including anti-GD2 therapy, with growing interest in harnessing T cell-mediated immunity. How T cells and their receptors (T-cell receptors, TCRs) are spatially organized and function within tumors remains poorly defined. To assess whether intratumoral location influences clonotype-specific T cell states, we profiled TCR repertoires across blood and tumor samples from 37 patients with HRNB using longitudinal bulk TCR sequencing. In a nested subset of 5 patients with paired pre- and post-therapy tumors, we integrated spatial transcriptomics with in situ TCR profiling. Across all tumors, T and B cells preferentially co-localized in immune-rich regions and showed reduced proximity to neuroblast cells. Despite this compartmentalized architecture, γδT cells were more evenly distributed across tumor sections and showed greater proximity to neuroblast-rich regions than other T cell subsets. Within TCR clonotypes, spatial location was associated with distinct transcriptional states, with immune-rich regions supporting more progenitor-like programs. These findings identify spatial context as a key determinant of clonotype-specific T cell phenotype and highlight γδT cells as a spatially distinct population with potential roles in neuroblastoma tumor-immune interactions.

6
Exploratory dried blood spot metabolomics identifies pathway-level convergence with ME/CFS biology in a self-reported PEM-like fatigue phenotype

Hauguel, P.; Anctil, N.; Noel, L.-P.

2026-06-10 rheumatology 10.64898/2026.06.08.26355197 medRxiv
Top 7%
0.3%

Background. Plasma and serum metabolomic studies of myalgic encephalomyelitis / chronic fatigue syndrome (ME/CFS) have repeatedly implicated hypometabolic, lipid, mitochondrial, redox and tryptophan-kynurenine pathways, but prior cohorts have been modest in size and have used heterogeneous case definitions. Whether similar pathway-level signals are detectable at scale in dried blood spots (DBS), across questionnaire-derived fatigue constructs and across orthogonal LC gradients in the same individuals remains unresolved. Methods. We profiled DBS extracts from 1,784 community-cohort adults by reverse-phase LC-MS using paired 5 min and 15 min gradients. Six questionnaire-derived endpoints captured a pragmatic self-reported PEM-like phenotype, a DSQ-derived PEM-like construct, high or review clinical status, temporal fatigue state, comorbid fatigue and self-reported chronic fatigue. The locked primary endpoint for Phase 1 was pragmatic_fatigue_pem with 226 cases and 914 controls after excluding major metabolic comorbidity. We tested a biology-first panel comprising 22 literature-curated metabolites represented by four participant-level descriptors each, and evaluated three discovery extensions: a targeted m/z search of additional literature candidates, a hypothesis-free univariate screen across 4,553 5 min and 5,625 15 min consensus features, and pairwise z-difference ratios. Endpoint-specific Ridge classifiers were evaluated by five-fold out-of-fold AUC with bootstrap stability filtering. Cross-gradient agreement was assessed by per-metabolite AUC concordance between paired 5 min and 15 min profiles. Severity was modelled as an ordinal grade derived from the number of fatigue criteria met and chronic-fatigue-form status. Results. The biology-first DBS panel achieved out-of-fold AUC 0.81 for the pragmatic self-reported PEM-like endpoint (226 cases / 914 controls). 
The DSQ-derived PEM-like construct reached AUC 0.60 (57 cases / 201 controls) on the un-filtered set and AUC 0.778 (SD 0.013, twenty seeds) in a post-hoc signature-decomposition follow-up restricted to participants without a self-declared major-metabolic-history tag (29 cases / 230 controls); both are treated as construct-validity anchors rather than as provoked or clinically adjudicated PEM. An optimised operationalisation of the same construct (panel-self normalisation, restriction to non-comorbid participants and demographic covariates) reached AUC 0.71 (95 % CI 0.55 to 0.76), and an exploratory age-stratified signature decomposition suggested age-dependent pathway composition that requires confirmation given small per-stratum case counts. Stable contributors mapped to carnitine-shuttle, TCA-cycle, redox-thiol and tryptophan-kynurenine pathways. Cross-gradient analysis of 22 matched metabolites yielded Pearson r = 0.62 for signed univariate effects (p = 0.002; 68 % directional agreement). The metabolomic score increased with severity grade (Spearman rho = 0.45, p = 4 x 10^-91; median scores 0.24, 0.51 and 0.75 across grades 0, 1 and 2). Sensitivity analyses on the covariate-complete subset (n = 565; 138 cases / 427 controls) showed that the DBS signal was robust to adjustment for age, sex, BMI and medication burden (DBS-only AUC 0.76, DBS plus covariates 0.78, covariates only 0.64), and produced a metabolomic-specific lift of approximately 0.13 AUC over the strongest anti-leak declarative cross-form questionnaire baseline (AUC 0.63). DBS-only AUC was stable across sex, age and BMI subgroups, and a 1:4 nearest-neighbour matched analysis on age, sex and BMI yielded AUC 0.72 (95 % CI 0.67 to 0.77). The observed pattern supported pathway-level convergence with prior ME/CFS metabolomics literature, including carnitine shuttle, fatty-acid beta-oxidation, TCA cycle, redox-thiol, urea cycle, glycerophospholipid and tryptophan-kynurenine axes. 
In contrast, the hypothesis-free 15 min screen produced high-AUC features that mapped predominantly to environmental or technical signals, including pesticide, industrial-amine and mobile-phase artifact annotations; only one of eight top leads, a truncated oxidised phospholipid, was biologically plausible, and none had tandem-MS support. Conclusions. In this large community cohort, a literature-curated DBS metabolomic panel captured pathway-level biology associated with a questionnaire-derived PEM-like fatigue phenotype, showed directional concordance across LC gradients, scaled with symptom severity and remained robust to key demographic, anthropometric and anti-leak questionnaire baselines. The findings converge with several metabolic axes previously reported in ME/CFS plasma and serum studies, including carnitine-shuttle, TCA-cycle, redox-thiol, urea-cycle, glycerophospholipid and tryptophan-kynurenine pathways. They should not be interpreted as clinical validation of a diagnostic test, screening tool or objective provoked-PEM biomarker. Rather, they support at-home-compatible DBS metabolomics as a biologically grounded platform for future clinically adjudicated validation, decision-support development and longitudinal monitoring in fatigue and PEM-like syndromes. Because DBS contains cellular and plasma-derived components, matrix effects must be considered when comparing individual metabolites with venous plasma or serum studies, and hypothesis-free screening at this scale can preferentially surface exposome or technical variance unless molecular identification is enforced before biological interpretation.
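The cross-gradient comparison reported above (Pearson r and percent directional agreement over matched metabolites) can be sketched as follows. This is an illustration only: it assumes each metabolite contributes one signed effect per gradient, for example AUC minus 0.5, which the abstract does not specify. The function names and the example values are hypothetical.

```python
from math import sqrt

def pearson_r(a, b):
    """Pearson correlation between two equal-length lists of signed effects."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def directional_agreement(a, b):
    """Fraction of paired effects that share the same sign."""
    return sum((x > 0) == (y > 0) for x, y in zip(a, b)) / len(a)

# Hypothetical signed effects for four metabolites on the two gradients.
effects_5min = [0.10, -0.20, 0.30, -0.10]
effects_15min = [0.20, -0.10, -0.05, -0.30]
r = pearson_r(effects_5min, effects_15min)
agreement = directional_agreement(effects_5min, effects_15min)
```

Reporting both numbers is useful because they answer different questions: the correlation reflects whether effect magnitudes track each other, while directional agreement only asks whether the sign is reproduced.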

7
Liver biopsy confirms precise and efficient correction of SERPINA1 after in vivo Base Editing in a Patient with Alpha-1 Antitrypsin Deficiency

Krooss, S. A.; Yang, T.; Yuan, Q.; Drick, N.; Sgodda, M.; Held, J.; Behrendt, P.; Hartleben, B.; Koczulla, R.; Ma, X.; Liu, Y.; Wedemeyer, H.; Janciauskiene, S.; Di Donato, N.; Cantz, T.; Wang, E.; Wu, Y.; Hoeper, M.; Xia, Q.; Ott, M.

2026-06-09 genetic and genomic medicine 10.64898/2026.06.01.26354551 medRxiv
Top 8%
0.2%

Background: Alpha-1 antitrypsin deficiency (AATD) caused by the PI*ZZ mutation (Glu342Lys) results in hepatic accumulation of misfolded AAT-Z protein and reduced circulating AAT levels, leading to progressive liver disease and emphysema. Gene correction therapy represents a potentially curative approach by directly correcting the underlying genetic defect. We report the first case of successful hepatic gene correction with early histological and functional assessment. Methods/Case presentation: We report the case of a 66-year-old male patient with PI*ZZ AATD who underwent gene correction therapy within the YOLT-202 phase I/Ia clinical trial (ClinicalTrials.gov ID NCT07193615). Ten weeks post treatment, a liver biopsy was performed to re-evaluate pre-existing F2 liver fibrosis as measured by elastography before entering the study. Serum samples allowed functional assessment of the AAT-mediated elastase inhibition. Results: Liver biopsy did not show signs of hepatic inflammation and demonstrated 54% (Sanger) and 57% (Illumina) gene correction rate of the PI*ZZ variant on the DNA level with no bystander edits or off-target effects. Following a transient elevation of transaminases during the early post-treatment period, liver enzymes normalized. Monthly serum AAT measurements demonstrated biologically active and stable therapeutic levels throughout follow-up. Conclusions: This case demonstrates efficient and precise hepatic gene correction without concerning histological alterations and with substantial improvement of functional parameters, supporting the feasibility and safety of gene editing approaches for AATD.

8
Clonal Hematopoiesis of Indeterminate Potential Refines Cardiovascular Risk Stratification in Cardiovascular-Kidney-Metabolic Syndrome Stages 0-3

Lu, J.; Sun, S.; Deng, Z.; Wang, S.; Wei, C.; Jiang, S.; Li, W.

2026-06-08 epidemiology 10.64898/2026.06.04.26354963 medRxiv
Top 8%
0.2%

Background: Chronic low-grade inflammation drives cardiovascular-kidney-metabolic (CKM) syndrome. Clonal hematopoiesis of indeterminate potential (CHIP), an age-related driver of systemic inflammation, is linked to several cardiometabolic disorders. However, whether CHIP modifies CKM progression and contributes to heterogeneity in cardiovascular disease (CVD) risk within the CKM framework remains uninvestigated. Methods: This cohort study included 307,025 UK Biobank participants at CKM stages 0-3 free of baseline CVD. CHIP status was identified via whole-exome sequencing (WES). The association between CHIP and baseline CKM severity was examined, along with the independent and joint effects of CHIP and CKM stages on incident CVD risk. The joint effects of CHIP and polygenic risk scores (PRS) were further assessed, and the incremental predictive value of incorporating CHIP into the AHA PREVENT equations was evaluated. Results: CHIP carriers were more likely to present with advanced CKM stages [OR 1.14 (1.09-1.20), P < 0.001] and exhibited higher incident CVD risk during follow-up [HR 1.13 (1.08-1.18), P < 0.001]. Significant joint effects between CHIP and CKM stages were observed, with the highest risk among CHIP carriers at CKM stage 3 [HR 1.63 (1.50-1.78), P < 0.001]. Large or multiple CHIP mutations conferred greater hazards, with distinct gene-specific effects observed. Moreover, CHIP and high genetic risk also jointly amplified CVD susceptibility. Most importantly, incorporating CHIP into AHA PREVENT significantly improved risk discrimination. Conclusions: CHIP is a significant risk factor associated with more advanced CKM stages and amplifies incident CVD risk. Integrating CHIP into existing prevention strategies may refine CVD risk stratification.

9
Physical activity, fatty acids, and MASLD risk: Behavioural and metabolic factors jointly shaping liver health in populations

Chen, F.; You, R.; Liu, Y.; Yin, Y.; Liu, A.; Deng, L.; Xie, B.; Fan, J.; Wang, W.

2026-06-08 epidemiology 10.64898/2026.06.05.26354982 medRxiv
Top 11%
0.2%

Background and Aims: Metabolic dysfunction-associated steatotic liver disease (MASLD) has become the most prevalent chronic liver disease globally. Although moderate-to-vigorous physical activity (MVPA) and plasma fatty acids have been individually studied in relation to metabolic health, their independent and combined associations with MASLD incidence remain unclear. We aimed to investigate these associations. Methods: This study included 51,717 UK Biobank participants free of liver disease at baseline, with MVPA measured using wrist-worn accelerometers and plasma fatty acids quantified via NMR. Multivariable-adjusted Cox models and restricted cubic splines were used. Results: Over a median follow-up of 7.8 years, 472 incident cases were identified. In fully adjusted models, meeting recommended MVPA levels together with higher n-6 PUFA concentrations was associated with a 71% lower risk (HR 0.29, 95% CI 0.18-0.45). The MVPA-MASLD association was nonlinear, with risk reduction plateauing at approximately 189 minutes per week. Higher n-6 PUFA was associated with reduced risk, whereas n-3 PUFA showed no significant association. Conclusions: These findings suggest that behavioral and metabolic factors may jointly influence MASLD risk. Further studies in diverse populations are needed to confirm these associations.

10
Beyond event-rate enrichment: proteomic risk scores for mechanism-aware prevention trial design

Fieggen, J.; Simond, G.; Segal, B. M.; Noori, A.; Thakurta, A.; Butler, C. C.; Clifton, D. A.; Clifton, L.

2026-06-10 health informatics 10.64898/2026.06.09.26355266 medRxiv
Top 13%
0.1%

Background. Blood-based biomarkers are increasingly proposed for identifying high-risk individuals before clinical disease and for making prevention-oriented trials more efficient. Prognostic enrichment can increase event rates, but trial efficiency also depends on whether the intervention effect is preserved in the enriched population. Methods. Using the UK Biobank Pharma Proteomics Project, we trained disease-specific proteomic risk scores (ProRS) from 2,916 plasma proteins with elastic-net Cox models. We compared ProRS, polygenic risk scores (PRS), and combined PRS-ProRS scores across ten incident diseases. We estimated cumulative incidence and theoretical two-arm time-to-event trial sample sizes across risk strata. To evaluate effect preservation, we examined six intervention-analogue exposure-outcome pairs spanning genetic (PCSK9/coronary artery disease, APOE/Alzheimer's disease, PPARG/type 2 diabetes, IL23R/Crohn's disease), behavioural (physical activity/all-cause mortality), and pharmacological (RAAS inhibitors versus calcium channel blockers/coronary artery disease) examples. Results. ProRS outperformed PRS for 9 of 10 diseases (median C-index 0.75 versus 0.61). ProRS and PRS were weakly correlated (median Pearson |r| = 0.04), and joint PRS-ProRS stratification identified groups with higher observed incidence than either score alone for several endpoints. In the top risk quartile, combined-score enrichment reduced theoretical required sample sizes by 32-74% under a fixed 20% relative hazard reduction. These gains were not always preserved when stratum-specific intervention-analogue effects were used. Effects were broadly preserved for APOE/Alzheimer's disease and physical activity/mortality. The PPARG/type 2 diabetes effect attenuated toward the null under all three score types, showing that event-rate enrichment does not guarantee effect preservation. 
For IL23R/Crohn's disease and the antihypertensive comparison, point estimates differed across score types (preserved under polygenic but attenuated under proteomic enrichment), but confidence intervals were wide and overlapping. Conclusions. Proteomic risk scores can identify high-event-rate populations for prevention-oriented trials, but event-rate enrichment alone is insufficient for trial design. Biomarker-guided enrichment should evaluate mechanism-specific effect preservation and may be preferable as a stratification or adaptive-design variable rather than as a restrictive eligibility criterion.
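To illustrate why event-rate enrichment shrinks a theoretical two-arm time-to-event sample size under a fixed hazard reduction, here is a minimal sketch using Schoenfeld's approximation for the required number of events with 1:1 allocation. The abstract does not say which formula the authors used, so this choice is an assumption, as are the function names, the two-sided alpha of 0.05, the 80% power, and the example event probabilities.

```python
from math import ceil, log
from statistics import NormalDist

def schoenfeld_events(hazard_ratio, alpha=0.05, power=0.80):
    """Events needed for a two-arm, 1:1 log-rank test (Schoenfeld approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)
    return 4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2

def total_sample_size(hazard_ratio, event_probability, alpha=0.05, power=0.80):
    """Participants needed so that the expected event count meets the requirement.

    `event_probability` is the average probability, across both arms, that a
    participant has an event during follow-up.
    """
    events = schoenfeld_events(hazard_ratio, alpha, power)
    return ceil(events / event_probability)

# A 20% relative hazard reduction corresponds to a hazard ratio of 0.8.
events_needed = schoenfeld_events(0.8)

# Hypothetical event probabilities: unselected population vs. enriched stratum.
n_unselected = total_sample_size(0.8, event_probability=0.05)
n_enriched = total_sample_size(0.8, event_probability=0.15)
relative_reduction = 1 - n_enriched / n_unselected
```

Under this approximation the required event count depends only on the hazard ratio, alpha, and power, so the sample-size saving is driven entirely by the ratio of event probabilities. That is also why the saving disappears if the hazard ratio itself attenuates in the enriched stratum, which is the abstract's central caution.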

11
Global Health Injustice From Climate Change Driven By Consumption

Rupcic, L.; Yoo, D.; Levasseur, A.; Alexandre, C.; Laurent, A.; Jolliet, O.

2026-06-12 public and global health 10.64898/2026.06.10.26355381 medRxiv
Top 15%
0.1%

Climate change imposes unequal health burdens from heat and cold, disproportionately harming vulnerable nations least responsible for emissions. A framework to quantitatively attribute this damage to different countries' consumption patterns has been missing. We developed a global framework linking consumption-based greenhouse gas emissions to country-specific health burdens, measured in Disability-Adjusted Life Years (DALYs). Our results quantify the profound scale of this externalized harm. For example, average North American consumption imposes a global health burden of 34 days of healthy life per person per year, without net damage suffered. In contrast, Sub-Saharan Africa endures 25 days per person per year despite minimal emissions. The resulting Health Injustice Index provides a powerful instrument for climate accountability, reframing responsibility in terms of tangible human health impacts.

12
Topological Deep Learning Identifies Polygenic Variant Clusters Across Familial Multimorbid Disorders

Vomo-Donfack, K. L.; Bousquet, G.; Falgarone, G.; Ginot, G.; Morilla, I.

2026-06-09 health informatics 10.64898/2026.06.03.26354242 medRxiv
Top 16%
0.1%

Whole-genome sequencing comprehensively captures coding, non-coding and structural variation in families with suspected inherited disorders, yet its clinical utility remains constrained by an interpretation bottleneck: selecting a handful of relevant variants from millions of candidates. Current rule-based pipelines, anchored in ACMG/AMP criteria, excel at identifying highly penetrant Mendelian alleles but frequently miss variants of low-to-moderate penetrance, non-coding alterations and germline-somatic interactions. Here we introduce PolyCLIP-T, a topology-guided multimodal framework that transforms variant selection from a classification problem into a geometric discovery task. By contrastively aligning DNA-sequence embeddings with functional annotations, PolyCLIP-T constructs a unified latent space in which the displacement between reference and alternate embeddings quantifies the molecular perturbation induced by each variant. Persistent homology then identifies stable topological components - coherent variant groups shared among affected relatives - that transcend single-variant scoring logic. Applied to six families with multi-morbid cancer, autoimmune and cardiovascular disease, PolyCLIP-T recovered non-coding and structural candidates overlooked by conventional pipelines and revealed pleiotropic networks spanning disease categories. This approach provides an interpretable, scalable solution for genome-first investigations of disorders driven by polygenic architectures that evade single-variant analysis. The framework was developed and benchmarked on deeply characterised familial cohorts selected for transgenerational multimorbidity; validation in larger, independent populations will be essential to establish its generalisability. An interactive web tool is freely available at https://www.polyclip-t.uma.es/.

13
Polygenic risk scores associate with asthma phenotypes and proteomic analyses implicate IL1R1 in two family-based studies

Lee, S.; Moll, M.; Mendez, K.; Prince, N.; Lasky-Su, J.; Lutz, S. M.; Weiss, S. T.; Lange, C.; Kelly, R. S.; Hecker, J.

2026-06-11 genetic and genomic medicine 10.64898/2026.06.06.26355045 medRxiv
Top 16%
0.1%

Despite its high prevalence and the discovery of hundreds of genetic associations, the genetic determinants and heterogeneous manifestations of asthma remain incompletely understood. Incorporating polygenic risk scores (PRS) into asthma research offers a powerful approach to quantify inherited susceptibility, refine risk profiles, and advance mechanistic understanding of disease development. For this study, we leveraged whole-genome sequencing (WGS) data from two family-based cohorts of childhood asthma - the Genetics of Asthma in Costa Rica Study (GACRS) and the Childhood Asthma Management Program (CAMP) - to examine the transmission profiles of externally derived asthma PRS and their associations with clinical phenotypes in children with asthma. To further elucidate molecular mechanisms, we integrated large-scale external genome-wide association study (GWAS) summary statistics and genetic prediction models of protein abundance in a two-step proteome-wide association study (PWAS) of asthma. Our findings provide robust evidence supporting the validity of externally derived asthma PRS (asthma PRS association $p = 10^{-24}$ [GACRS and CAMP trios combined] for the Global Biobank Meta-analysis Initiative [GBMI]) and reveal consistent associations with spirometry measures and atopy markers across both studies, as 13 of 21 traits (62%) were significantly associated with the GBMI-PRS in the meta-analysis after multiple-testing correction. Moreover, the results of the integrative proteomic analysis implicate IL-1 signaling in the etiology of asthma, reinforcing the candidacy of IL1R1 antagonists for drug repurposing.

14
Rare neurological and neurodevelopmental variants in ALS link to onset, survival and family history

O'Donoghue, C.; Kacar, E.; Gomes, T.; Costello, E.; Pender, N.; Peelo, C.; Ryan, M.; Heverin, M.; Byrne, S.; Bede, P.; Hardiman, O.; McLaughlin, R. L.; Byrne, R. P.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.09.26354977 medRxiv
Top 17%
0.1%

Background: Neurological, neuropsychiatric, and neurodevelopmental disorders cluster in ALS families, sharing a common genetic architecture with ALS. Pathogenic variants in genes associated with other neurological, neurodevelopmental, or neuropsychiatric disorders may also co-occur in ALS and modify phenotype. We have sought to determine the prevalence and clinical pattern of likely-pathogenic/pathogenic (LP/P) non-ALS neurological, neurodevelopmental, and neuropsychiatric variants, alone and in combination with ALS-gene variants, in two large ALS cohorts. Methods: Whole-genome sequencing (WGS) of 469 Irish and 774 Answer ALS people with ALS (pwALS) was analysed for ClinVar LP/P variants associated with other neurological (n = 15541), neurodevelopmental (n = 9761), and neuropsychiatric (n = 321) phenotypes. Inheritance patterns for associated genes (autosomal recessive/autosomal dominant) along with the associated phenotype were validated using OMIM. Standardised clinical data included family history, site and age of onset, El Escorial category, survival, motor decline, and cognitive and behavioural assessments. Known ALS-gene variants and C9orf72 repeat expansion status were included for each cohort. Results: Non-ALS neurological variants were identified in 47/469 (10.0%) Irish and 69/774 (8.9%) Answer ALS participants, most frequently in hereditary spastic paraplegia-associated genes (3.2% Irish; 2.8% Answer ALS). Irish neurological variant carriers showed higher frequency of respiratory onset (10.6% vs 1.2%, Fisher's exact p = 0.002, Φ = 0.20) and fewer premorbid behavioural symptoms (0.92 ± 0.56 vs 3.08 ± 0.97, Cohen's d = -0.40). Neurodevelopmental variants occurred in 12/469 (2.6%) Irish and 20/774 (2.6%) Answer ALS participants. 
In the Irish cohort, neurodevelopmental variant carriers had significantly shorter survival in a Cox proportional hazards model (log-rank p = 0.005), corresponding to a more than two-fold increased hazard of death (HR = 2.25, 95% CI 1.26-4.00), and had significantly increased familial burden of neuropsychiatric disorders among first- and second-degree relatives (negative binomial IRR for carriers = 2.41, 95% CI: 1.12-5.18, p = 0.025). Across combined cohorts, 18 individuals (Irish n = 8; Answer ALS n = 10) carried ≥2 LP/P variants spanning ALS and non-ALS genes. Conclusion: Rare LP/P variants in genes associated with other neurological and neurodevelopmental disorders occur in up to 12% of pwALS across two independent cohorts. Carriers show distinct phenotypes, shorter survival, and characteristic family history patterns. These findings suggest that extended pleiotropic and oligogenic architectures may contribute to ALS heterogeneity.

15
Disentangling Confounders from Pathology in Long-COVID Trajectory Prediction for Women: An Interpretable Large-Language-Model Approach

Wang, J.; Galis, Z.; Zhang, T.; Luo, Y.; Sra, A.; Niu, X.; Shen, J.; Xie, Q.; Weiss, J. C.

2026-06-12 infectious diseases 10.64898/2026.06.10.26355420 medRxiv
Top 19%
0.1%

Objective. Post-acute sequelae of SARS-CoV-2 infection (PASC, "Long COVID") disproportionately affects women, in whom hallmark symptoms (insomnia, fatigue, palpitations, cognitive difficulty) overlap with comorbidities and hormonal transitions such as menopause. This diagnostic overlap is a confounding problem: models that forecast future symptom severity risk attributing baseline physiological noise to viral pathology. We ask whether an interpretable, causally disentangled language model can separate true pathological signal from such confounders while remaining competitive with strong predictors of future PASC severity.

16
Analytical Centralization of Health Expenditure at the National Administrator of Health System Resources: Architecture, Data Quality, and Operational Performance of the ADRES Health System Analytics Platform, Colombia

Garavito Jimenez, D. A.; Bello Angulo, D. E.; Mejia Lemus, L. T.; Chipatecua, D.; Fula, D. D.; Perez-Rubiano, S.; Martinez, F. L.; Bohorquez Pinzon, J. C.

2026-06-10 public and global health 10.64898/2026.06.08.26355159 medRxiv
Top 21%
0.1%

Background Between 2024 and 2025, Colombia universalized the Electronic Health Invoice with embedded Individual Health Services Delivery Records (RIPS), together known as FEV-RIPS, as the standard for financial and clinical data exchange. ADRES -- the entity responsible for administering the resources of Colombia's General Social Security Health System -- faced the challenge of processing information from multiple heterogeneous sources generated by more than 55,000 healthcare providers. Health systems in high-income countries converge clinical-financial data in consolidated platforms; Colombia started from a fragmented architecture with incompatible historical sources, no cross-database standardization, and no centralized analytical infrastructure until 2023. Objective We describe the design, technical challenges of integrating heterogeneous data, and operational performance of the analytical infrastructure built by ADRES to centralize large-scale processing of Colombian health system information, and derive transferable lessons for health system resource administrators in Latin America facing equivalent digitalization mandates. Methods Technical-descriptive report based on operational metrics from the ADRES Azure/Databricks environment during January-November 2025. We report indicators of data volume, processing speed, computational capacity, concurrent use by functional group, and governance structure. The architecture integrates VPN connectivity with MinSalud, automated processing of multiple formats (XML, relational tables, flat files), and a medallion data lake (Bronze/Silver/Gold). Data quality challenges include structural inconsistencies across sources, coding incompatibilities (municipalities, dates, diagnoses), format heterogeneities in unstructured data, and absent technical documentation.
Results The platform manages 21 catalogs, 1,183 tables, and over 110,645 million stored records, with cumulative production exceeding 1 trillion processed records. It executes queries on 100 billion records in ten seconds using clusters of up to 32 TB RAM and 4,096 vCPU. During September-October 2025, monthly query peaks reached 78,028 across eleven functional groups. Integration required Python/PySpark parsers for variable-depth XML, equivalence tables for incompatible municipality codes, cleaning routines for extreme dates used as nulls (1900-01-01, 9999-12-31), and transformation logic bridging classic RIPS and FEV-RIPS. The platform supported econometric analyses, judicial mandate responses, and public interactive dashboards. Conversational AI integration (Genie, Copilot) extends analytical access to users without SQL knowledge. Conclusions ADRES built in one year an analytical infrastructure that provides, to our knowledge, the first published documentation of the systemic technical challenges of integrating heterogeneous data sources in a middle-income social security health system. Centralizing health system information at national scale is technically feasible under public institutional constraints -- but requires solving cross-source standardization problems the implementation literature does not document with quantitative precision. The derived lessons are transferable to health system resource administrators in Latin America facing equivalent challenges.
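The cleaning step for extreme dates used as nulls can be illustrated with a minimal sketch. It uses plain Python rather than the platform's PySpark code, which the abstract does not show; the function name `clean_date` and the assumption that dates arrive as ISO-formatted strings are hypothetical, and only the two sentinel values come from the abstract.

```python
from datetime import date
from typing import Optional

# Sentinel dates the abstract names as placeholders for missing values.
SENTINEL_DATES = {date(1900, 1, 1), date(9999, 12, 31)}


def clean_date(raw: Optional[str]) -> Optional[date]:
    """Parse an ISO date string; return None for blanks, unparseable text, or sentinels.

    Hypothetical helper for illustration only, not the ADRES implementation.
    """
    if raw is None or not raw.strip():
        return None
    try:
        parsed = date.fromisoformat(raw.strip())
    except ValueError:
        # Malformed input is treated the same way as a missing value.
        return None
    return None if parsed in SENTINEL_DATES else parsed
```

Mapping sentinels to a true null before aggregation keeps them from distorting minimum, maximum, or duration calculations downstream.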

17
Neonatal mortality risk of large-for-gestational age and macrosomic live births in low- and middle-income subnational birth cohorts: An individual participant meta-analysis (2000-2017)

Kirakoya Samadoulougou, F.; Barche, B.; Ukwishaka, J.; Subedi, S.; Erchick, D. J.; Suarez Idueta, L.; Hamer, D. H.; Semrau, K. E. A.; Hamomba, F. M.; Banda, B.; Manasyan, A.; Pry, J. M.; Maleta, K.; Ashorn, U.; Schmiegelow, C.; Hjort, L.; Minja, D. T. R.; Lusingu, J. P. A.; Freitas da Silveira, M.; Buffarini, R.; Baqui, A. H.; Khanam, R.; Ahmed, S.; Zhu, Z.; Zeng, L.; Cheng, Y.; Lachat, C.; Roberfroid, D.; Huybregts, L.; Toe, L. C.; Tielsch, J. M.; Khatry, S. K.; Mullany, L. C.; Ohuma, E. O.; Blencowe, H.; Katz, J.; Lee, A. C. C.; Black, R. E.; Hazel, E. A.

2026-06-06 public and global health 10.64898/2026.06.03.26354851 medRxiv
Top 22%
0.1%

Background Large-for-gestational-age (LGA) and macrosomic newborns are at increased risk of adverse perinatal outcomes, including death, yet the burden of neonatal mortality associated with these conditions in low- and middle-income countries (LMICs), where ongoing nutritional and epidemiological transitions suggest their prevalence will rise, remains poorly quantified. In this study, we quantify the neonatal mortality risk associated with LGA and macrosomia from 16 subnational birth cohorts in low- and middle-income countries between 2000 and 2017. Methods and findings This is an individual-participant meta-analysis to estimate neonatal mortality rates (NMRs) and relative risks among LGA infants (>90th and >97th percentile birth weight-for-gestational-age using INTERGROWTH-21st) versus appropriate-for-gestational-age (AGA, 10th-90th percentile) infants. Macrosomic (≥4000 g and ≥4500 g) neonates were compared with those weighing 2500-3999 g. Missing birth weights were imputed using recalibration and multiple imputation methods. We used random effects meta-analysis to pool relative risks. Median prevalences of LGA >90th and >97th percentile were 5.3% (interquartile range 3.6-8.2) and 2.6% (IQR 1.3-4.5), respectively; macrosomia (≥4000 g and ≥4500 g) prevalences were 1.0% (IQR 0.3-3.1) and 0.06% (IQR 0.0-0.30), respectively. Mortality was highest among preterm plus LGA infants (61.3 per 1000). LGA infants in the >90th percentile had over twofold increased mortality compared with appropriate-for-gestational-age infants (RR: 2.46; 95% CI: 1.86-3.25), while >97th percentile infants had a higher risk (RR: 3.77; 95% CI: 2.50-5.69). Term LGA >97th percentile infants also showed elevated mortality (RR: 3.14; 95% CI: 1.58-6.22). For LGA >97th percentile, the risk was higher in the early neonatal period (RR: 2.71; 95% CI: 1.92-3.82) than late (RR: 1.69; 95% CI: 1.22-2.34).
There was no overall association between macrosomia (≥4000 g) and neonatal mortality. Population attributable fractions were 7.2% for LGA >90th percentile and 0.4% for macrosomia (≥4000 g). Conclusions Neonatal mortality risks were elevated among LGA infants in low- and middle-income countries, particularly at extreme values (>97th percentile) and during the early neonatal period. Macrosomia showed weaker, less robust associations. Although LGA prevalence is currently low (~5%) and contributes less to neonatal mortality than small newborns, ongoing nutritional and epidemiological transitions suggest increasing prevalence. This highlights the need for strengthened surveillance, monitoring, and improved delivery planning to ensure that no population is left behind.
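As a rough consistency check on the reported population attributable fraction (PAF), the worked calculation below assumes Levin's formula. The abstract does not state which estimator the authors used, and plugging in the median cohort prevalence (p = 0.053) alongside the pooled relative risk (RR = 2.46) is a simplification made here for illustration.

```latex
\mathrm{PAF} = \frac{p\,(\mathrm{RR} - 1)}{1 + p\,(\mathrm{RR} - 1)}
             = \frac{0.053 \times 1.46}{1 + 0.053 \times 1.46}
             = \frac{0.0774}{1.0774}
             \approx 0.072
```

Under these assumptions the result is about 7.2%, in line with the figure reported for LGA >90th percentile.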

18
Socio-demographic Correlates of Prolonged Amenorrhea and Menopausal Transition among Nigerian Women Aged 30-49: Evidence from the 2024 Nigeria Demographic and Health Survey

Ogunsemoyin, O.; Ayinmoro, A. D.

2026-06-09 public and global health 10.64898/2026.06.06.26355063 medRxiv
Top 23%
0.0%

Introduction Menopause is a central marker of reproductive ageing, but national evidence on menstrual cessation among Nigerian women in the late reproductive ages remains limited. This study examined the prevalence and socio-demographic correlates of prolonged amenorrhea/possible menopausal transition among Nigerian women aged 30-49 years. Methods The study used the women's individual recode file from the 2024 Nigeria Demographic and Health Survey. The analytic sample was restricted to women aged 30-49 years, excluding women who were currently pregnant, currently or postpartum amenorrheic, and those with invalid or special responses on time since last menstrual period. The final sample comprised 14,223 women. The outcome combined women whose last menstrual period occurred 12 or more months before the survey, and women reported as being in menopause. Weighted descriptive statistics, design-adjusted bivariate tests and survey-weighted binary logistic regression were used. Results The weighted prevalence of prolonged amenorrhea/possible menopausal transition was 7.6%. Prevalence rose from 1.2% among women aged 30-34 years to 23.6% among women aged 45-49 years. In the adjusted model, women aged 35-39 years (OR=1.64; p=0.030), 40-44 years (OR=6.20; p<0.001) and 45-49 years (OR=24.51; p<0.001) had higher odds than women aged 30-34 years. Primary education (OR=1.65; p=0.004), middle wealth status (OR=1.37; p=0.043) and poorest wealth status (OR=1.60; p=0.024) were associated with higher odds. Muslim affiliation (OR=0.72; p=0.024) and traditional contraceptive use (OR=0.24; p<0.001) were associated with lower odds. Conclusion Prolonged amenorrhea/possible menopausal transition among Nigerian women aged 30-49 is strongly age-patterned and socially differentiated. The findings support the need to make midlife menstrual health more visible within reproductive, family planning and primary healthcare services. 
Because the measure is based on survey-reported menstrual recency, it should not be interpreted as clinically confirmed natural menopause.

19
Epidemiology of Cervical Precancerous Lesions: Prevalence and Predictors from Pap Smear Screening in Hawassa City Hospitals, Sidama Region, Ethiopia. Institutional-Based Cross-sectional Study

Fisshatsion, A. B.; Zewude, Y. A.; Nisro, A. M.; Abebe, R. F.

2026-06-10 public and global health 10.64898/2026.06.09.26355254 medRxiv
Top 23%
0.0%

Background: Cervical cancer is the fourth most common cancer in women worldwide and remains a major public health challenge. In Ethiopia, it is the second leading cause of cancer deaths, with around 8,000 new cases and 6,000 deaths each year. Region-specific data on the prevalence and predictors of precancerous lesions remain scarce, yet such information is vital for guiding targeted reproductive health strategies. This study therefore examined the prevalence and predictors of cervical precancerous lesions among women aged 21-60 years undergoing Pap smear screening in public hospitals in Hawassa City, Sidama Region. Methods: An institution-based cross-sectional study was conducted among 241 women attending Pap smear screening at public hospitals in Hawassa City from March to August 2025. Sociodemographic and clinical data were collected via interviews and medical records. Lesions were classified according to the Bethesda system, the standardized international framework for reporting cervical cytology results from Pap smears. Multivariable logistic regression identified predictors (p<0.05). Results: Of 241 women screened (mean age 35.3 years), cervical epithelial abnormalities were detected in 52 (prevalence 21.6%). Atypical squamous cells of undetermined significance was the most common abnormality (16.6%). Multivariable analysis showed HIV infection was significantly associated with precancerous lesions (AOR = 3.7, 95% CI: 1.69-8.12, p<0.05), while hormonal contraceptive use was protective (AOR = 0.27, 95% CI: 0.11-0.67, p<0.05). Conclusion: These results underscore the urgent need to strengthen cervical cancer prevention through targeted screening and early intervention. Integrating routine HIV testing with Pap smear programs would be especially valuable. Health authorities should expand accessible screening for women aged 21-60, with particular attention to those living with HIV, to help reduce the burden of precancerous lesions.

20
Plasma protein prioritisation in rheumatoid arthritis reveals druggable targets and shared biology with cardiovascular diseases

Alduhayhi, S. S.; Morris, A. P.; Zhao, S.; Bowes, J.

2026-06-11 epidemiology 10.64898/2026.06.10.26355332 medRxiv
Top 24%
0.0%

Background Rheumatoid arthritis (RA) is an autoimmune inflammatory disease with complex and incompletely understood molecular mechanisms. Understanding circulating proteins associated with RA may improve understanding of disease biology and clarify its pathological links with cardiometabolic comorbidities. Methods A proteome-wide two-sample Mendelian randomisation (MR) drug target analysis was conducted using plasma proteins measured in 54,219 participants from the UK Biobank Pharma Proteomics Project as exposures and RA and cardiometabolic diseases as the outcomes. Summary statistics for RA included 53,663 cases and 1,070,200 controls. Colocalisation analysis was performed to confirm shared single causal variants and prioritise RA proteins supported by both MR and colocalisation. The prioritised proteins were then evaluated in the Accelerating Medicines Partnership RA Phase II synovial single-cell dataset for cell-type expression patterns. Druggability was then assessed followed by analysis of genetic overlap between RA-associated proteins and cardiometabolic diseases. Results 37 plasma proteins had a causal effect on RA risk, supported by combined evidence from MR and conditional colocalisation. In synovial tissue, TPPP3, RARRES2, AKAP12, and GGT5 were predominantly expressed in stromal and endothelial cell clusters. Druggability assessment identified IFNGR2, IL6R, CD40, and FCGR2B as Tier 1 targets. However, several biologically relevant proteins, including RARRES2, AKAP12, TPPP3, and SNX2, had limited available druggability data. Genetic overlap analysis demonstrated shared protein signals between RA and cardiovascular diseases, including overlap of RARRES2 and TPPP3 with coronary artery disease (CAD) and FCGR2B with atrial fibrillation (AF). To approximate the therapeutic effect of target inhibition, the direction of effect estimates for proteins showing overlap between RA-CAD and RA-AF was reversed.
Conclusion This study identified circulating proteins involved in RA pathogenesis and revealed shared mechanisms between RA and cardiovascular diseases. While some proteins showed clear translational potential as targets, several prioritised proteins had limited available druggability information and could not be confidently classified. Addressing these gaps may help identify new targets relevant to RA management. Future work should also use phenome-wide MR studies to evaluate potential on-target adverse effects of protein inhibition across RA-CAD and RA-AF.